A dynamic hybrid model based on wavelet 
and fuzzy regression for time series estimation 



Olfa Zaafrane 

Department of Quantitative Methods, Faculty of Economic Sciences and 
Management, Sidi Messaoud, 5111 Hiboun, Mahdia, Tunisia. 

Anouar Ben Mabrouk

Computational Mathematics Laboratory, Department of Mathematics, Faculty of
Sciences, 5019 Monastir, Tunisia.



Abstract 

In the present paper, a fuzzy-logic-based method is combined with wavelet
decomposition to develop a step-by-step dynamic hybrid model for the estimation
of financial time series. Empirical tests of fuzzy regression, wavelet
decomposition and the new hybrid model are conducted on the well-known SP500
index series. The tests show that the hybrid model yields a much smaller
estimation error than fuzzy regression alone.



1 Introduction



The study of time series is an important task, especially in financial contexts
such as modeling, estimation, approximation and prediction. It requires a
precise and deep understanding of the characteristics of the series so that a
suitable model can be chosen. Estimation helps to detect the causes of past
dysfunctions and therefore to take possible precautions at the right time. A
fine, preventive analysis prepares for the future and supports robust
prediction in the face of random breaks and unanticipated changes. Financial
time series, for example, are characterized by very specific stylized facts,
and an estimation method proves its efficiency by respecting them. Consider
first the tails of the distribution: in the leptokurtic case, as measured by
the kurtosis, values far from the mean of the series appear with probabilities
exceeding those of the normal distribution. Studies of financial series have
shown that the distribution is not normal; on the contrary, its kurtosis
exceeds that of the normal case. Furthermore, as volatility clustering shows,
financial time series are characterized by complex combinations of
high-frequency components. These facts are partly due to the random or
stochastic behavior of the markets. Besides, the market may be characterized by
infinite volatility, allowing long-memory processes. This induces a
scaling-law invariance of the volatility (Walter, 2001). Indeed, Walter
considers that reconciling the absence of long memory in profitability with its
presence in volatility is a financial modeling problem. Because of these facts,
some classical methods have been judged incapable of analyzing financial
series. ARCH and GARCH models do not take into account the degree of kurtosis
of the series. Furthermore, the ARCH model and its variants have reached their
limits in financial modeling because the scaling law in volatility is not
included in the model (see also Walter, 2001). For this reason, researchers in
financial time series have introduced other methods that may yield more
efficient models and help to understand aspects such as non-stationarity,
auto-regression, filtering, support vector machine models and neural network
models for prediction. See (Angue, 2007), (Azizieh, 2002), (Ben Mabrouk et al.,
2008a,b), (Ben Mabrouk et al., 2008), (Ben Mabrouk et al., 2010), (Ben Mabrouk
et al., 2011), (Chang et al., 2001), (Chen et al., 2006), (Chou, 2005), (Klir
et al., 1995), (He et al., 2007), (Khashei et al., 2008), (Kim et al., 1996),
(Mitra et al., 2004), (Podobnik et al., 2004), (Ramsey, 1999), (Struzik, 2000),
(Tanaka et al., 1982), (Tseng et al., 1999), (Tseng et al., 2001), (Wang et
al., 2000), (Watada, 1992), (Wu et al., 2002), (Zopoundis et al., 2001).

In the present paper, our aim is to apply wavelet theory and fuzzy logic to
develop an estimation model for financial series. We first assess the
efficiency of fuzzy regression for estimating financial series. Next, we apply
the discrete wavelet decomposition, which improves in particular the study of
the local behavior of the series. Comparing the two estimation methods, we
found that a hybrid model combining wavelet estimation with fuzzy logic
estimation is possible. We then developed such a model, which takes into
account the non-stationary behavior of the series as well as its local
fluctuations and its fuzzy characteristics. The model combines wavelet
decomposition with fuzzy regression. Finally, an empirical study based on the
SP500 index is provided to support the theoretical parts.

The present paper is organized as follows. Section 2 presents the data and the
characteristics of the series. Section 3 develops the fuzzy regression model
for the estimation of financial time series. Section 4 provides a wavelet
analysis of the series. Section 5 develops the hybrid model obtained by
combining fuzzy logic with wavelet decomposition and reports the empirical
comparison of the different models on the SP500 index, which shows the benefit
of the hybrid scheme. Section 6 concludes.



2 The Data Description 

In the present paper, we study the behavior of the well-known SP500 financial
index, a stock index describing the fluctuations of the market capitalization
of the 500 largest companies on the American stock market. It is composed of
380 industrial firms, 73 financial companies, 37 public service firms and 10
transport firms. The choice of this index is motivated essentially by its
central role as a measure of the performance of the American economy. Besides,
international financial integration keeps increasing, which makes
internationally traded products strongly related. As the American market is the
center of international transactions, any variation of one of its indices, such
as the SP500, immediately affects other markets. Furthermore, the study of the
USA market index is of current interest because of the international financial
crisis, which started in this market and then spread to markets worldwide.
Understanding the crisis is therefore a priority.

The data set consists of monthly values of the SP500 index over the period from
August 1998 to March 2009, giving a sample of size $N = 128 = 2^7$. We work
with the logarithm of the series in order to reduce its range. The statistical
characteristics of the series are summarized in the following table.



Sample size, N     128
Mean               7.0921
Variance           0.0246
Maximum            7.3456
Minimum            6.6000
Kurtosis           3.2229
Skewness           -0.9017



Table 1. Statistical characteristics

We notice a kurtosis value exceeding the normal value 3, which means that the
series is leptokurtic. The skewness is negative, which means that the data are
spread out more to the left of the mean of the series than to the right. The
following figure represents the original series $S(t) = \log(SP(t))$, where
$SP(t)$ is the value of the SP500 index at month $t$.
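As an illustration of how the quantities in Table 1 can be computed, here is a minimal Python sketch. It assumes population (1/n) moments and the non-excess kurtosis, for which the normal value is 3; the paper does not state which normalisation was used. The function name `summary_stats` and the sample values are hypothetical and are not the SP500 data.

```python
import math

def summary_stats(values):
    """Mean, variance, extremes, skewness and non-excess kurtosis of a sample."""
    n = len(values)
    mean = sum(values) / n
    # Central moments with the 1/n normalisation (an assumption of this sketch).
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    return {
        "n": n,
        "mean": mean,
        "variance": m2,
        "min": min(values),
        "max": max(values),
        "skewness": m3 / m2 ** 1.5,
        "kurtosis": m4 / m2 ** 2,  # equals 3 for a normal distribution
    }

# Hypothetical index levels, NOT the data used in the paper.
levels = [1000.0, 1100.0, 1050.0, 1200.0, 900.0, 800.0, 1150.0, 1250.0]
log_series = [math.log(x) for x in levels]  # S(t) = log(SP(t))
stats = summary_stats(log_series)
```

A kurtosis above 3 and a negative skewness in the returned dictionary would correspond to the leptokurtic, left-skewed behavior described above.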



Fig. 1. Original series $S(t)$



3 A fuzzy regression model 

There are several reasons for testing fuzzy regression as a model for financial
series. Firstly, in financial series the relation between the dependent
variable and the independent one (here, the time variable) is always ambiguous.
Almost all statistical methods ignore this ambiguity and assume, on the
contrary, that the behavior is always definite. Furthermore, financial series
such as the SP500 fluctuate constantly and unpredictably. This persistent
fluctuation makes the future values of the series fuzzy and/or imprecise. Fuzzy
regression has already been applied as a method of choice for the estimation of
uncertain and imprecise data. See (He et al., 2007), (Khashei et al., 2008),
(Kim et al., 1996), (Sanchez et al., 2003), (Shapiro, 2005), (Terence, 1999),
(Tseng et al., 1999), (Tseng et al., 2001), (Watada, 1992), (Wu et al., 2002),
(Zopoundis et al., 2001).

In this section, a fuzzy regression model is applied to estimate the SP500
index series. We use the model due to Watada (1992), which is reviewed
hereafter. It is based on the following fuzzy linear program.

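\[
\begin{aligned}
\min\ & S = s_0 + s_1 \\
\text{s.t.}\ & c_0 + c_1 t_i - (1-h)\,(s_0 + s_1 |t_i|) \le Y_i, \\
& c_0 + c_1 t_i + (1-h)\,(s_0 + s_1 |t_i|) \ge Y_i, \\
& s_0 \ge 0 \ \text{and}\ s_1 \ge 0, \\
& \forall\, i = 1, \dots, 128,
\end{aligned}
\tag{1}
\]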

where

- $h$ is a standard threshold, hereafter fixed at $h = 0.5$.

- $a_j = (c_j, s_j)$, $j = 0, 1$, is a triangular fuzzy number.

- $t_i$ is the time variable.

- $Y_i$ is the observed index value at time $t_i$, $i = 1, \dots, 128$.
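The linear program above can be handed to any LP solver. The following is a minimal sketch using SciPy's `linprog`, not the LINGO setup used in the paper. The decision vector is ordered as (c_0, c_1, s_0, s_1); the function names `fit_fuzzy_line` and `fuzzy_band` and the toy data are hypothetical and serve only as illustration.

```python
import numpy as np
from scipy.optimize import linprog

def fit_fuzzy_line(t, y, h=0.5):
    """Solve min s0 + s1 subject to the two band constraints of the program."""
    t = np.asarray(t, dtype=float)
    y = np.asarray(y, dtype=float)
    w = 1.0 - h
    ones = np.ones_like(t)
    abs_t = np.abs(t)
    # Lower bound of the band must lie below each observation.
    lower = np.column_stack([ones, t, -w * ones, -w * abs_t])
    # Upper bound must lie above each observation (rows negated for A_ub x <= b_ub).
    upper = np.column_stack([-ones, -t, -w * ones, -w * abs_t])
    res = linprog(
        c=[0.0, 0.0, 1.0, 1.0],
        A_ub=np.vstack([lower, upper]),
        b_ub=np.concatenate([y, -y]),
        bounds=[(None, None), (None, None), (0, None), (0, None)],
    )
    if not res.success:
        raise RuntimeError(res.message)
    c0, c1, s0, s1 = res.x
    return (c0, s0), (c1, s1)  # a_0 = (c_0, s_0), a_1 = (c_1, s_1)

def fuzzy_band(t, a0, a1, h=0.5):
    """Lower and upper estimates implied by the constraints."""
    t = np.asarray(t, dtype=float)
    (c0, s0), (c1, s1) = a0, a1
    centre = c0 + c1 * t
    half_width = (1.0 - h) * (s0 + s1 * np.abs(t))
    return centre - half_width, centre + half_width

# Hypothetical toy data, NOT the SP500 series.
t = np.arange(1, 9)
y = np.array([7.00, 7.05, 7.02, 7.10, 7.08, 7.15, 7.11, 7.20])
a0, a1 = fit_fuzzy_line(t, y)
lo, hi = fuzzy_band(t, a0, a1)
```

By construction every observation lies between `lo` and `hi`, which is exactly what the two inequality constraints require.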

The problem is solved using the LINGO 9 software, which yields the following
fuzzy coefficients (triangular fuzzy numbers).

\[ a_0 = (6.887995,\ 0) \quad \text{and} \quad a_1 = (0.01066521,\ 0.06912872). \tag{2} \]

As a result, the lower and upper estimates of the index series are obtained,
giving the following fuzzy regression equation.

\[ \hat{Y}_i = (6.887995,\ 0) + (0.01066521,\ 0.06912872) \times 0.5 \times t_i. \tag{3} \]

The original series and its fuzzy estimation are shown in Figure 2.

We notice that, although the fuzzy regression model takes into account the
uncertainty of the information, it does not fit the tendency of the series
well: it assumes a monotone behavior, which means that it ignores the
fluctuations characterizing the data. Besides, the estimation error is large,
with the values

MSE = 5.31016565 and RMSE = 2.30437967 

where

\[ MSE = \frac{1}{128} \sum_{i=1}^{128} (\hat{Y}_i - Y_i)^2, \tag{4} \]

\[ RMSE = \sqrt{MSE}, \tag{5} \]

and $\hat{Y}_i$ is the estimated value of the index at time $t_i$,
$i = 1, \dots, 128$.

Fig. 2. Original series and its fuzzy regression estimation
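The two error measures (4) and (5) translate directly into code. The sketch below is a plain-Python illustration; the function names `mse` and `rmse` and the sample values are hypothetical.

```python
import math

def mse(observed, estimated):
    """Mean squared error: average of (estimate - observation)^2."""
    if len(observed) != len(estimated):
        raise ValueError("series must have the same length")
    return sum((e - o) ** 2 for o, e in zip(observed, estimated)) / len(observed)

def rmse(observed, estimated):
    """Root mean squared error, the square root of the MSE."""
    return math.sqrt(mse(observed, estimated))

# Hypothetical values, NOT taken from the paper.
observed = [7.0, 7.1, 7.2, 7.1]
estimated = [7.1, 7.1, 7.0, 7.2]
error = mse(observed, estimated)
root_error = rmse(observed, estimated)
```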

In conclusion, fuzzy regression proved incapable of providing a robust,
low-error estimation of the series considered. It needs to be corrected so as
to fit the fluctuations, and hence the random behavior, of the series. An
analysis that localizes these fluctuations is therefore necessary. This is the
role of wavelet analysis, which is developed in the next section.



4 Wavelet analysis of the series 

Wavelet analysis is commonly applied to show how volatile a series is, and
hence to detect possible fluctuations (Patick, 2005). It also makes it possible
to represent strongly fluctuating series without knowing the explicit
functional dependence. This capacity is especially valuable for financial time
series, where such a dependence is generally unknown.

We now conduct a wavelet analysis of the SP500 index series in order to
localize its fluctuations precisely. A maximum decomposition level $J = 6$ is
fixed, which gives a projection onto the approximation space $V_6$ of a
Daubechies DB4 multi-resolution analysis, computed with the Matlab software.

As a result, the series $S(t)$ is decomposed in the form

\[ S = (A_6, D_1, D_2, D_3, D_4, D_5, D_6) \]

or equivalently,

\[ S = D_1 + D_2 + D_3 + D_4 + D_5 + D_6 + A_6 \]

where $A_6$ is the global shape of $S(t)$ at level 6, also called the trend or
tendency, and $D_1$, $D_2$, $D_3$, $D_4$, $D_5$ and $D_6$ are the detail
components of $S(t)$ obtained by projecting the series onto the detail spaces
$W_1$, $W_2$, $W_3$, $W_4$, $W_5$ and $W_6$. These components are represented
hereafter.
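To make the additive form $S = A_6 + D_1 + \dots + D_6$ concrete, here is a minimal Python sketch of a multi-resolution decomposition. It deliberately uses the Haar wavelet, whose approximation at level $j$ is the average over blocks of $2^j$ samples, instead of the Daubechies wavelet used in the paper, so that it fits in a few lines without a wavelet library. The function names and the synthetic signal are hypothetical.

```python
import math

def block_means(signal, size):
    """Haar approximation: replace each block of `size` samples by its mean."""
    out = []
    for start in range(0, len(signal), size):
        block = signal[start:start + size]
        out.extend([sum(block) / len(block)] * len(block))
    return out

def haar_mra(signal, levels):
    """Return (A_J, [D_1, ..., D_J]) with signal = A_J + D_1 + ... + D_J."""
    if len(signal) % (2 ** levels) != 0:
        raise ValueError("length must be a multiple of 2**levels")
    details = []
    previous = list(signal)  # A_0 is the signal itself
    for j in range(1, levels + 1):
        approx = block_means(signal, 2 ** j)  # A_j
        # D_j = A_{j-1} - A_j, the information lost between two levels.
        details.append([p - a for p, a in zip(previous, approx)])
        previous = approx
    return previous, details

# Synthetic 128-point series, NOT the SP500 data.
signal = [
    7.0 + 0.2 * math.sin(2 * math.pi * k / 64) + 0.05 * math.cos(2 * math.pi * k / 4)
    for k in range(128)
]
a6, details = haar_mra(signal, 6)
rebuilt = [a + sum(d[k] for d in details) for k, a in enumerate(a6)]
```

Here `rebuilt` coincides with `signal` up to rounding, `details[0]` plays the role of $D_1$ (finest scale) and `a6` that of the trend $A_6$. With a Daubechies filter the components differ, but the same additive reconstruction holds.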

Fig. 3. Approximation $A_6$




Fig. 4. Detail component $D_6$



We can easily see from these figures where the fluctuations of the series are
located. The component $A_6$ shows the low-frequency fluctuations. The
components $D_i$, $i = 1, 2, \dots, 6$, represent the high-frequency behavior.
We remark that the series fluctuates more at the detail levels $D_1$, $D_2$ and
$D_3$ than at $D_4$, $D_5$ and $D_6$. The volatile aspect of the series is
clearly observed from $D_1$ and $D_2$.

Fig. 6. Detail component $D_4$



5 Hybrid estimation model



Having localized the fluctuations of the series, we return to the fuzzy
regression model and correct it by developing a dynamic fuzzy regression that
takes into account both the fluctuations and the uncertain aspect of the
series. Denote by $S(t)$ the financial time series of the SP500 index
introduced previously. The proposed hybrid model is described by the following
steps.

• Step 1: Compute the wavelet decomposition of the series,
$(D_1, D_2, D_3, D_4, D_5, D_6, A_6)$.

• Step 2: Compute the locations of the extremum points of each component
$D_i$, $i = 1, 2, \dots, 6$.

• Step 3: Apply the fuzzy regression to estimate the restriction of the series
to each interval $[t_n(i), t_{n+1}(i)]$, where $t_n(i)$ and $t_{n+1}(i)$ are
two consecutive extremum points of the component $D_i$, $i = 1, 2, \dots, 6$.

• Step 4: For all $i = 1, 2, \dots, 6$, regroup the new series obtained on the
whole time interval $\bigcup_n [t_n(i), t_{n+1}(i)]$.

Fig. 8. Detail component $D_2$
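Steps 2 to 4 can be sketched in Python as follows. This is only an illustration under stated assumptions: the extrema are taken as strict changes of direction (plateaus are ignored), the first and last time indices are added as breakpoints, and an ordinary least-squares line stands in for the per-interval fuzzy regression, which would be plugged in through the `fit` argument. All function names and the toy data are hypothetical.

```python
def local_extrema(component):
    """Indices where the component strictly changes direction (Step 2)."""
    idx = []
    for k in range(1, len(component) - 1):
        left = component[k] - component[k - 1]
        right = component[k + 1] - component[k]
        if left * right < 0:
            idx.append(k)
    return idx

def least_squares_line(ts, ys):
    """Stand-in for the interval fit: ordinary least-squares line."""
    n = len(ts)
    t_mean = sum(ts) / n
    y_mean = sum(ys) / n
    sxx = sum((t - t_mean) ** 2 for t in ts)
    slope = sum((t - t_mean) * (y - y_mean) for t, y in zip(ts, ys)) / sxx
    return lambda t: y_mean + slope * (t - t_mean)

def piecewise_estimate(series, component, fit=least_squares_line):
    """Fit the series on each interval between consecutive extrema (Steps 3, 4)."""
    points = [0] + local_extrema(component) + [len(series) - 1]
    estimate = [0.0] * len(series)
    for a, b in zip(points[:-1], points[1:]):
        ts = list(range(a, b + 1))
        model = fit(ts, series[a:b + 1])
        for t in ts:
            # A shared breakpoint is overwritten by the later interval.
            estimate[t] = model(t)
    return estimate

# Toy zigzag series; the same values are used as the "detail component".
series = [0.0, 1.0, 2.0, 1.0, 0.0, 1.0, 2.0, 3.0]
extrema = local_extrema(series)
estimate = piecewise_estimate(series, series)
```

A component with many extrema produces many short intervals, so the piecewise estimate can follow the series closely, while a component with few extrema gives long intervals and a coarser estimate.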



We easily remark that the proposed model follows the piecewise monotonicity of
the time series. On each interval where the series is monotone, the fuzzy
regression is applied with the corresponding fuzzy numbers. The results of this
model are shown in the following figures.

Fig. 9. Detail component $D_1$

Fig. 10. Estimation relative to $D_6$.

Fig. 11. Estimation relative to $D_5$.

Fig. 12. Estimation relative to $D_4$.

Fig. 13. Estimation relative to $D_3$.

Fig. 14. Estimation relative to $D_2$.

Fig. 15. Estimation relative to $D_1$.

As we see, the estimation given by the hybrid model fits the original series
better as the detail level decreases. Here, we stress that the Daubechies
wavelets in the Matlab software use the frequency index $-j$, contrary to the
theoretical definition of wavelet bases, which uses the index $j$ instead. So,
we expect the detail of the approximation to increase as $j$ decreases. Indeed,
the estimation relative to $D_6$ is rather coarse (see Figure 10). This is
because this component does not contain many extremum points or fluctuations.
The estimation becomes more efficient when using $D_5$ (see Figure 11). Next,
Figures 12, 13, 14 and 15 show an increasing agreement between the original
series and the one estimated by the hybrid model. This is because the hybrid
model follows the fluctuations of the series well. To finish with this model,
the following table gives the error estimates corresponding to the details
$D_i$, $i = 1, 2, \dots, 6$.




Model                 MSE          RMSE
Fuzzy Regression      1.5380       1.2401
Hybrid with $D_6$     0.4100061    0.64032
Hybrid with $D_5$     0.1506692    0.380324
Hybrid with $D_4$     0.0402139    0.2006
Hybrid with $D_3$     0.01133175   0.1064507
Hybrid with $D_2$     0.00261067   0.0611
Hybrid with $D_1$     0.00060889   0.024675

Table 2. Error estimates



6 Conclusion



In the present paper, fuzzy regression is applied to estimate financial time
series. This estimation is shown to be inefficient: it bounds the series by
affine functions, which do not follow the fluctuations well. As financial time
series are very volatile, a wavelet decomposition is applied next to localize
the fluctuations and to prepare a more sophisticated fuzzy model that takes
them into account. As a result, a hybrid model combining fuzzy regression and
wavelet decomposition is developed. Finally, the different models are tested on
the well-known SP500 index series. The empirical tests show that the hybrid
model is efficient, with an error that decreases as finer detail levels are
used. In the future, we intend to apply the hybrid method, or modified versions
of it, to other time series and to prediction.